Results

In Bolivia, as in many other developing countries, data on GDP and other indicators of economic activity are limited by delayed publication, inadequate disaggregation, and low frequency. Although a quarterly GDP series and the monthly Global Index of Economic Activity (IGAE, for its Spanish acronym) exist, they are typically released with a delay of three to six months.

In response to these constraints, this article proposes a monthly GDP nowcast indicator for the Bolivian economy. Following Giannone et al. (2008) and Banbura et al. (2013), nowcasting is defined as “forecasting values of a time series not published by official sources for the current period”.1

The results indicate that the Bolivian economy is estimated to have expanded by 3.23% by the end of 2022. Although the monthly indicator of economic activity, the IGAE, showed cumulative growth of 4.3% through September, overall economic activity has decelerated since October. This slowdown can be largely attributed to the partial halt in economic activity caused by the civic strikes that originated in the department of Santa Cruz, one of the regions that contributes most to national production.2

The analysis indicates that the recent social conflicts had a discernible impact on economic activity. Specifically, the estimates suggest modest growth of 0.4% in October followed by a decline of 2% in November, each relative to the same month of the preceding year.

Methodologically, machine learning algorithms are used to nowcast Bolivia’s monthly economic activity. These techniques identify patterns in complex datasets and yield more accurate and timely projections of economic activity, which are a valuable input for decision-making and resource allocation and contribute to a better understanding of current economic conditions and trends.

Methodology

The current monthly GDP nowcast for Bolivia is obtained with machine learning methods. Specifically, the nowcast is the average of the predictions of three algorithms: Gradient Boosting Regressor, Ada Boost Regressor, and Random Forest Regression. These were selected from a larger set of candidate algorithms through a k-fold cross-validation process. The forecasts are based on the monthly Global Index of Economic Activity (IGAE).

Training, validation and test sets

The study aims to predict the monthly Global Index of Economic Activity, \(y\), using approximately 50 monthly variables, \(\mathbf{X}\), as potential predictors. These variables include current and lagged economic indicators published by the National Institute of Statistics of Bolivia, export and import data, indicators of the financial, fiscal and monetary system, and variables on domestic and commodity prices. The sample period ranges from January 2007 to September 2022.
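The predictor set mixes current and lagged indicators. As a minimal sketch of how lagged columns could be built from a monthly series, the following uses only the Python standard library. The helper `add_lags`, the series name `exports`, and the sample values are hypothetical and are not taken from the article's data.

```python
def add_lags(name, series, lags):
    """Build lagged copies of a monthly series.

    Row t of the column for lag L holds series[t - L]. The first L rows
    have no earlier observation available and are filled with None.
    """
    n = len(series)
    return {
        f"{name}_lag{lag}": ([None] * lag + list(series))[:n]
        for lag in lags
    }


# Hypothetical monthly values, for illustration only.
exports = [10.0, 12.0, 11.0, 13.0]
lagged = add_lags("exports", exports, lags=[0, 1, 2])

for column, values in lagged.items():
    print(column, values)
```

Rows that contain `None` would have to be dropped or imputed before fitting a model; which of the two the original study does is not stated.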

To ensure the robustness of the predictive models and select the most suitable algorithms, the sample was divided into three subsamples: a training set, a validation set, and a test set. The training set covers the period from 2007M1 to 2017M12, the validation set covers 2018M1 to 2022M9, and the test set (i.e., the nowcast period) covers 2022M10 to 2022M12. All variables are z-score normalized so that variables measured on different scales are comparable. Specifically, the input values are adjusted according to the following formula:

\[x^{(i)}_j = \dfrac{x^{(i)}_j - \mu_j}{\sigma_j} \tag{4}\]

where \(i\) indexes an observation (a row) and \(j\) selects a variable (a column) in the \(\mathbf{X}\) matrix. \(\mu_j\) is the mean of all the values for variable \(j\) and \(\sigma_j\) is the standard deviation of variable \(j\) from the training set.
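The normalization step can be sketched as follows, under the assumption that both the mean and the standard deviation are estimated on the training set and then reused for later periods. The function names, variable names, and values are hypothetical. The population standard deviation (`pstdev`) is also an assumption, since the text does not say which estimator is used.

```python
from statistics import mean, pstdev


def fit_zscore(train_columns):
    """Estimate (mean, standard deviation) per variable on the training set."""
    return {
        name: (mean(values), pstdev(values))  # pstdev: assumed estimator
        for name, values in train_columns.items()
    }


def apply_zscore(columns, params):
    """Standardize each variable with previously fitted parameters."""
    return {
        name: [(x - params[name][0]) / params[name][1] for x in values]
        for name, values in columns.items()
    }


# Hypothetical values, for illustration only.
train = {"exports": [10.0, 12.0, 14.0], "cpi": [100.0, 102.0, 104.0]}
later = {"exports": [16.0], "cpi": [106.0]}

params = fit_zscore(train)
train_z = apply_zscore(train, params)
later_z = apply_zscore(later, params)  # reuses the training-set parameters

print(train_z["exports"])
print(later_z["exports"])
```

Reusing the training-set parameters for the validation and nowcast periods keeps information from later months out of the scaling step.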

Model Selection Process

Machine learning algorithms have gained popularity in GDP nowcasting because they often predict better than traditional statistical models. However, given the diverse range of machine learning algorithms that could be suitable for this purpose, a k-fold cross-validation process is implemented to identify the most appropriate ones.

K-fold cross-validation is a widely accepted technique for evaluating the predictive performance of machine learning algorithms. The process involves partitioning the dataset into k equally sized subsets or “folds”. One of the folds is reserved for validation, while the remaining k-1 folds are used to train the algorithm. This procedure is repeated k times, with each iteration selecting a different fold for validation and using the other k-1 folds for training. The results of the k iterations are then averaged to obtain an overall performance metric, such as mean squared error. This method helps to mitigate the bias that may arise from evaluating an algorithm on a single split of the data, which can hide overfitting or underfitting.
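The procedure described above can be illustrated with a small standard-library sketch. The function names are hypothetical, the folds are contiguous blocks (one possible choice among several), and the "model" is a toy predictor that always returns the training mean. It stands in for a real algorithm only to show the mechanics.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) for k contiguous, near-equal folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def mean_squared_error(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def cross_val_mse(y, k):
    """Average validation MSE of a toy model that predicts the training mean."""
    fold_scores = []
    for train_idx, val_idx in k_fold_indices(len(y), k):
        train_mean = sum(y[i] for i in train_idx) / len(train_idx)
        y_val = [y[i] for i in val_idx]
        fold_scores.append(mean_squared_error(y_val, [train_mean] * len(y_val)))
    return sum(fold_scores) / k, fold_scores


# Hypothetical target values, for illustration only.
y = [float(v) for v in range(20)]
avg_mse, fold_scores = cross_val_mse(y, k=10)
print(len(fold_scores), avg_mse)
```

Each observation appears in exactly one validation fold, so every data point contributes once to the averaged score.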

In this context, the predictive performance of the following machine learning algorithms is evaluated using k-fold cross-validation (with k=10), which provides a more comprehensive assessment of their effectiveness than a single split.

  • Linear Regression
  • Lasso
  • ElasticNet Regression
  • Ridge Regression
  • K Neighbors Regressor
  • Decision Tree Regressor
  • Support Vector Regression
  • Ada Boost Regressor
  • Gradient Boosting Regressor
  • Random Forest Regression
  • Extra Trees Regressor

The findings indicate that the Gradient Boosting Regressor, Ada Boost Regressor, and Random Forest Regression have negative mean squared errors closest to zero, that is, the lowest mean squared errors, making them the most appropriate algorithms for forecasting the IGAE.

Models comparison: Negative Mean Squared Error distribution by algorithm

| Model          | Mean  | SD   |
|----------------|-------|------|
| Linear         | -0.14 | 0.06 |
| Lasso          | -1.00 | 0.13 |
| ElasticNet     | -0.48 | 0.08 |
| Ridge          | -0.09 | 0.04 |
| Bayesian Ridge | -0.07 | 0.03 |
| KNN            | -0.10 | 0.04 |
| Decision Tree  | -0.10 | 0.08 |
| SVR            | -0.18 | 0.10 |
| AdaBoost       | -0.05 | 0.03 |
| Gradient Boost | -0.05 | 0.04 |
| Random Forest  | -0.05 | 0.04 |
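A comparison of this kind could be produced along the following lines. This is a sketch that assumes scikit-learn, whose estimator names match those listed above and whose `neg_mean_squared_error` scoring reports MSE with the sign flipped so that values closer to zero are better. The article does not state which library was used. The data are synthetic, only a subset of the candidate algorithms is shown, and all hyperparameters are library defaults, not the study's settings.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the standardized predictors and target.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=120)

models = {
    "Linear": LinearRegression(),
    "Ridge": Ridge(),
    "AdaBoost": AdaBoostRegressor(random_state=0),
    "Gradient Boost": GradientBoostingRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(random_state=0),
}

# 10 folds, as in the text; shuffling is an assumption of this sketch.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    scores = cross_val_score(
        model, X, y, cv=cv, scoring="neg_mean_squared_error"
    )
    results[name] = (scores.mean(), scores.std(), len(scores))
    print(f"{name:<15} mean={scores.mean():.3f} sd={scores.std():.3f}")
```

For monthly time series, shuffled folds let a model train on observations that come after the validation months. scikit-learn's `TimeSeriesSplit` is one alternative when that is a concern; the article does not say which scheme it used.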

The following graph compares the IGAE observations with the predictions of the selected algorithms over the training period, showing that the two track each other closely.

Finally, the average of the forecasts from the Gradient Boosting Regressor, Ada Boost Regressor, and Random Forest Regression algorithms serves as the final nowcast.
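The averaging step amounts to an equal-weight combination of the three model forecasts for each month. A minimal sketch follows; the function name and the forecast values are made up for illustration and are not the article's results.

```python
def average_nowcast(forecasts):
    """Equal-weight average of per-model forecasts, month by month.

    `forecasts` maps a model name to a {month: forecast} dictionary.
    All models are expected to cover the same months.
    """
    model_names = list(forecasts)
    months = list(forecasts[model_names[0]])
    return {
        month: sum(forecasts[m][month] for m in model_names) / len(model_names)
        for month in months
    }


# Made-up forecasts in standardized units, for illustration only.
forecasts = {
    "gradient_boosting": {"2022M10": 0.10, "2022M11": -0.30},
    "ada_boost": {"2022M10": 0.20, "2022M11": -0.10},
    "random_forest": {"2022M10": 0.30, "2022M11": -0.20},
}

nowcast = average_nowcast(forecasts)
for month, value in nowcast.items():
    print(month, round(value, 2))
```

Because the models are trained on standardized data, a combined forecast in these units would still need to be converted back to the scale of the IGAE before computing growth rates.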


  1. Banbura, M., Giannone, D., Modugno, M. & Reichlin, L. (2013). Now-casting and the real-time data flow. Handbook of Economic Forecasting (pp. 195-237). Elsevier.

    Giannone, D., Reichlin, L. & Small, D. (2008). Nowcasting: The real-time informational content of macroeconomic data. Journal of Monetary Economics, 55(4), 665-676. https://doi.org/10.1016/j.jmoneco.2008.05.010

  2. News regarding the civic strikes that occurred in Santa Cruz during the months of October and November 2022 can be accessed in link1 and link2.